Audio-Driven Dubbing for User Generated Contents via Style-Aware Semi-Parametric Synthesis

Authors

Abstract

Existing automated dubbing methods are usually designed for Professionally Generated Content (PGC) production, which requires massive training data and time to learn a person-specific audio-video mapping. In this paper, we investigate an audio-driven dubbing method that is more feasible for User Generated Content (UGC) production. There are two unique challenges in designing a method for UGC: 1) the appearances of speakers are diverse and arbitrary, as the method needs to generalize across users; 2) the available video data of one speaker is very limited. In order to tackle the above challenges, we first introduce a new Style Translation Network to integrate the speaking style of the target and the speaking content of the source via a cross-modal AdaIN module. It enables our model to quickly adapt to a new speaker. Then, we further develop a semi-parametric video renderer, which takes full advantage of the limited training data of the unseen speaker via a video-level retrieve-warp-refine pipeline. Finally, we propose a temporal regularization for the semi-parametric renderer, generating more continuous videos. Extensive experiments show that our method generates videos that accurately preserve various speaking styles, yet with a considerably lower amount of training data and time in comparison to existing methods. Besides, our method achieves a faster testing speed than most recent methods.
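As a rough illustration of the AdaIN operation named in the abstract, the sketch below applies the standard adaptive instance normalization formula to a single feature channel: the content features are normalized to zero mean and unit standard deviation, then rescaled with the style features' mean and standard deviation. The abstract does not describe how the paper's cross-modal module computes its style statistics, so the function name `adain`, the `eps` parameter, and the toy feature vectors are all assumptions made for illustration, not the authors' implementation.

```python
from statistics import fmean, pstdev


def adain(content, style, eps=1e-5):
    """Standard AdaIN on one feature channel (illustrative sketch only).

    Normalizes `content` to zero mean and unit std, then rescales it
    with the mean and std of `style`. `eps` guards against division by zero.
    """
    mu_c, sigma_c = fmean(content), pstdev(content)
    mu_s, sigma_s = fmean(style), pstdev(style)
    return [sigma_s * (x - mu_c) / (sigma_c + eps) + mu_s for x in content]


# Hypothetical stand-ins: one channel of content features (e.g. derived from
# the source audio) and one channel of style features (e.g. derived from the
# target speaker's video). The two inputs may have different lengths.
content_feat = [0.0, 1.0, 2.0, 3.0]
style_feat = [10.0, 10.5, 11.0, 11.5, 12.0]

out = adain(content_feat, style_feat)
```

After the call, `out` keeps the ordering of the content features but carries (up to `eps`) the mean and spread of the style features, which is the general sense in which AdaIN transfers "style" statistics onto "content".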

Related resources

Semi-parametric Image Synthesis

We present a semi-parametric approach to photographic image synthesis from semantic layouts. The approach combines the complementary strengths of parametric and nonparametric techniques. The nonparametric component is a memory bank of image segments constructed from a training set of images. Given a novel semantic layout at test time, the memory bank is used to retrieve photographic references ...

Interactive Data-Driven Audio Synthesis

We present a method that allows novice users to interactively build, design, and edit audio samples using a data-driven synthesis method. With this tool a user can create an audio stream that matches a visual animation using an input sound to provide the qualitative properties of the new sound. The user can easily shape the output to match the desired behavior by specifying sections of the input...

Audio signal Driven sound synthesis

A new approach to computer music instruments is described. Rather than sense control parameters from acoustic instruments (or non-acoustic instrument controllers), the sound of an acoustic instrument is used directly by a synthesis algorithm, usually replacing an oscillator. Parameters such as amplitude and pitch can control other aspects of the synthesis. This approach gives the player more co...

Exploring data driven parametric synthesis

This paper describes our work on building a formant synthesis system based on both rule generated and database driven methods. Three parametric synthesis systems are discussed: our traditional rule based system, a speaker adapted system, and finally a gesture system. The gesture system is a further development of the adapted system in that it includes concatenated formant gestures from a data-d...

User Driven Two-Dimensional Computer-Generated Ornamentation

Hand drawn ornamentation, such as floral or geometric patterns, is a tedious and time consuming task that requires skill and training in ornamental design principles and aesthetics. Ornamental drawings both historically and presently play critical roles in all things from art to architecture, and when computers handle the repetition and overall structure of ornament, considerable savings in tim...

Journal

Journal title: IEEE Transactions on Circuits and Systems for Video Technology

Year: 2023

ISSN: 1051-8215, 1558-2205

DOI: https://doi.org/10.1109/tcsvt.2022.3210002